Deterministic Feature Selection for K-Means Clustering
Similar resources
Unsupervised Feature Selection for the $k$-means Clustering Problem
We present a novel feature selection algorithm for the k-means clustering problem. Our algorithm is randomized and, assuming an accuracy parameter ε ∈ (0, 1), selects and appropriately rescales in an unsupervised manner Θ(k log(k/ε)/ε) features from a dataset of arbitrary dimensions. We prove that, if we run any γ-approximate k-means algorithm (γ ≥ 1) on the features selected using our method, ...
Selection of K in K-means clustering
The K-means algorithm is a popular data-clustering algorithm. However, one of its drawbacks is the requirement for the number of clusters, K, to be specified before the algorithm is applied. This paper first reviews existing methods for selecting the number of clusters for the algorithm. Factors that affect this selection are then discussed and a new measure to assist the selection is proposed....
K-means Clustering with Feature Hashing
One of the major problems of K-means is that one must use dense vectors for its centroids, and therefore it is infeasible to store such huge vectors in memory when the feature space is high-dimensional. We address this issue by using feature hashing (Weinberger et al., 2009), a dimension-reduction technique, which can reduce the size of dense vectors while retaining sparsity of sparse vectors. ...
Crack Fault Classification for Planetary Gearbox Based on Feature Selection Technique and K-means Clustering Method
During the condition monitoring of a planetary gearbox, features are extracted from raw data for fault diagnosis. However, different features have different sensitivities for identifying different fault types, and thus, the selection of a sensitive feature subset from an entire feature set and retaining as much of the class discriminatory information as possible has a direct effect on the acc...
Journal
Journal title: IEEE Transactions on Information Theory
Year: 2013
ISSN: 0018-9448,1557-9654
DOI: 10.1109/tit.2013.2255021